Clustering with Instance-level Constraints

Authors

  • Kiri Wagstaff
  • Claire Cardie
Abstract

Clustering algorithms conduct a search through the space of possible organizations of a data set. In this paper, we propose two types of instance-level clustering constraints – must-link and cannot-link constraints – and show how they can be incorporated into a clustering algorithm to aid that search. For three of the four data sets tested, our results indicate that the incorporation of surprisingly few such constraints can increase clustering accuracy while decreasing runtime. We also investigate the relative effects of each type of constraint and find that the type that contributes most to accuracy improvements depends on the behavior of the clustering algorithm without constraints.
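
As a rough illustration of how must-link and cannot-link constraints can steer a clustering search, the sketch below checks whether placing an instance in a candidate cluster would break any constraint. It is not taken from the paper: the function name `violates_constraints`, the pair-list representation of constraints, and the dict-based `assignment` are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

Pair = Tuple[int, int]


def violates_constraints(
    point: int,
    cluster: int,
    assignment: Dict[int, int],
    must_link: Iterable[Pair],
    cannot_link: Iterable[Pair],
) -> bool:
    """Return True if putting `point` into `cluster` breaks a constraint.

    `assignment` maps already-placed instance ids to cluster ids.
    Hypothetical helper for illustration, not the paper's implementation.
    """
    for a, b in must_link:
        if point in (a, b):
            other = b if point == a else a
            # A must-link partner already placed elsewhere is a violation.
            if other in assignment and assignment[other] != cluster:
                return True
    for a, b in cannot_link:
        if point in (a, b):
            other = b if point == a else a
            # A cannot-link partner already in this cluster is a violation.
            if other in assignment and assignment[other] == cluster:
                return True
    return False
```

A constrained clustering loop could call this check for each candidate cluster and skip the candidates for which it returns True, which is one way such constraints can prune the search.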

Similar resources

Clustering Trees with Instance Level Constraints

Constrained clustering investigates how to incorporate domain knowledge in the clustering process. The domain knowledge takes the form of constraints that must hold on the set of clusters. We consider instance-level constraints, such as must-link and cannot-link. This type of constraint has been successfully used in popular clustering algorithms, such as k-means and hierarchical agglomerative ...

Combining Data Clusterings with Instance Level Constraints

Recent work has focused on the incorporation of a priori knowledge into the data clustering process, in the form of pairwise constraints, aiming to improve clustering quality and find appropriate clustering solutions to specific tasks or interests. In this work, we integrate must-link and cannot-link constraints into the cluster ensemble framework. Two algorithms for combining multiple data partit...

Instance-Level Constraints in Density-Based Clustering

Clustering data into meaningful groups is one of the most important tasks of both artificial intelligence and data mining. In general, clustering methods are considered unsupervised. However, in recent years, so-called constraints have become more popular as a means of incorporating additional knowledge into clustering algorithms. Over the last years, a number of clustering algorithms employing different t...

SOM based clustering with instance-level constraints

This paper describes a new topological map dedicated to clustering under instance-level constraints. In general, traditional clustering is used in an unsupervised manner. However, in some cases, background information about the problem domain is available or imposed in the form of constraints, in addition to data instances. In this context, we modify the popular SOM algorithm to take these cons...

From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering

We present an improved method for clustering in the presence of very limited supervisory information, given as pairwise instance constraints. By allowing instance-level constraints to have space-level inductive implications, we are able to successfully incorporate constraints for a wide range of data set types. Our method greatly improves on the previously studied constrained k-means algorithm, g...

Publication date: 2000